We report the first observation of single top quark production using $p\bar{p}$ collision data at
$\sqrt{s} = 1.96$ TeV collected by the CDF II and D0 detectors at Fermilab. The observed
significance is 5.0 standard deviations for both the CDF and D0 data; the expected sensitivity is
in excess of 5.9 standard deviations for CDF and equal to 4.5 standard deviations for D0. The
single top production cross section and the CKM matrix element $|V_{tb}|$ have been measured.

1 Introduction 

Establishing the electroweak production of single top quarks in $p\bar{p}$ collisions is an
important goal of the Tevatron program. The reasons for studying single top quarks are
compelling: the production cross section is directly proportional to the square of the
CKM matrix element $|V_{tb}|$, and thus a measurement of the rate constrains fourth-generation
models, models with flavor-changing neutral currents, and other new phenomena. Furthermore,
understanding single top quark production provides a solid anchor to test the analysis techniques
that are also used to search for Higgs boson production and other more speculative phenomena.

In the standard model (SM), top quarks are expected to be produced singly through t-channel or
s-channel exchange of a virtual W boson. This electroweak production of single top quarks is a
difficult process to measure because the expected combined production cross section
($\sigma_{s+t} \approx 2.9$ pb) is much smaller than those of competing background processes.
Also, the presence of only one top quark in the event provides fewer features to use in
separating the signal from background, compared with measurements of top pair production
($t\bar{t}$), which was first observed in 1995.

Both the CDF and D0 collaborations have published evidence for single top quark production
at significance levels of 3.7 and 3.6 standard deviations, respectively. This article describes
the latest analyses using data collected with the CDF II (with 3.2 fb$^{-1}$) and D0 (with
2.3 fb$^{-1}$) detectors at the Tevatron and reports observation of single top quark production.

Since the two collaborations use similar analysis techniques, the next sections apply to both
analyses unless otherwise stated.

2 Event Selection and Backgrounds 

For the analyses shown here, we assume that single top quarks are produced in the s- and
t-channel modes with the SM ratio, and that the branching fraction of the top quark to Wb is
100% (corresponding to $|V_{tb}| \gg |V_{ts}|, |V_{td}|$). For most of the analysis channels, we
seek events in which the W boson decays leptonically in order to improve the
signal-to-background ratio s/b.

The basic event selection requires $\ell + \not\!\!E_T$+jets events, where $\ell$ is an explicitly
reconstructed electron or muon from the W boson decay. This lepton is required to be isolated
from nearby jets and to have large transverse momentum. The presence of high missing
transverse energy ($\not\!\!E_T$) and at least two energetic jets is also required. At least one
of the jets has to be identified as containing a B hadron (b-tagged).
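
The selection logic above can be sketched in a few lines. This is only an illustration: the event and jet containers, the function name and, in particular, the numerical thresholds in the example call are placeholders chosen for the sketch, not the cuts used by either experiment.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Jet:
    et: float                # transverse energy in GeV
    b_tagged: bool = False   # True if the jet is identified as containing a B hadron

@dataclass
class Event:
    lepton_pt: float         # transverse momentum of the electron or muon, GeV
    lepton_isolated: bool    # True if the lepton is isolated from nearby jets
    missing_et: float        # missing transverse energy, GeV
    jets: List[Jet] = field(default_factory=list)

def passes_selection(event, min_lepton_pt, min_missing_et, min_jet_et):
    """Apply the lepton + missing-ET + jets requirements described in the text."""
    if not event.lepton_isolated or event.lepton_pt < min_lepton_pt:
        return False
    if event.missing_et < min_missing_et:
        return False
    energetic_jets = [j for j in event.jets if j.et >= min_jet_et]
    if len(energetic_jets) < 2:
        return False
    # At least one of the energetic jets must be b-tagged.
    return any(j.b_tagged for j in energetic_jets)

# Placeholder thresholds (GeV), for illustration only.
candidate = Event(35.0, True, 40.0, [Jet(60.0, b_tagged=True), Jet(30.0)])
selected = passes_selection(candidate, min_lepton_pt=20.0,
                            min_missing_et=25.0, min_jet_et=20.0)
```

Passing the thresholds as arguments keeps the sketch independent of any particular experiment's cut values.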

The background has contributions from events in which a W boson is produced in association
with one or more heavy flavor jets, events with mistakenly b-tagged light-flavor jets, multijet
events (QCD), $t\bar{t}$ and diboson processes, as well as Z+jets events.

The expected number of $\ell + \not\!\!E_T$+jets events in CDF, as a function of the number of
jets, is shown for the signal and each background process in Fig. 1 (left). The D0 yields for
events with 2, 3 or 4 jets are shown in Fig. 1 (right). These figures make clear that the single
top signal is hidden under large and uncertain backgrounds, which rules out a simple counting
experiment.

Figure 1: Left: Expected number of CDF $\ell + \not\!\!E_T$+jets events as a function of the number of jets for the signal and
each background process. Right: D0 yields for events with 2, 3 and 4 jets.



3 Multivariate Analysis 



To overcome these challenges, a variety of multivariate techniques for separating single top events 
from the backgrounds have been developed as described below. 



3.1 Likelihood Function (LF) 

This technique is used only by the CDF collaboration. A projective likelihood technique
is used to combine information from several input variables to optimize the separation of the
single top signal from the backgrounds. Two likelihood functions are created, one for two-jet
events and one for three-jet events, with 7 and 10 input variables, respectively. Some of the
input variables used are: the total scalar sum of transverse energy in the event $H_T$,
$Q \times \eta$, the dijet mass $M_{jj}$, $\cos\theta^{*}_{\ell j}$ and the t-channel matrix element.

A new, separate search for single top production in the s-channel is also done using this
technique (LFS). In this case, the likelihood function is optimized to be sensitive to the
s-channel process, using the subset of the $\ell + \not\!\!E_T$+jets sample with two b-tagged jets.
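
As a rough illustration of the projective likelihood idea, the sketch below multiplies per-variable template probabilities for the signal and background hypotheses and forms their ratio. It assumes a single background class and ignores correlations between variables; the binning, the toy templates and all function names are assumptions made for the sketch, not the CDF implementation.

```python
def normalize(hist):
    """Turn bin counts into a unit-area template."""
    total = float(sum(hist))
    return [h / total for h in hist]

def bin_index(value, lo, hi, nbins):
    """Map a value to a bin index, clamping under/overflow into the edge bins."""
    if value <= lo:
        return 0
    if value >= hi:
        return nbins - 1
    return int((value - lo) / (hi - lo) * nbins)

def projective_likelihood(event, sig_templates, bkg_templates, ranges):
    """Combine per-variable template probabilities into one discriminant in [0, 1]."""
    p_sig, p_bkg = 1.0, 1.0
    for x, sig, bkg, (lo, hi) in zip(event, sig_templates, bkg_templates, ranges):
        k = bin_index(x, lo, hi, len(sig))
        p_sig *= sig[k]
        p_bkg *= bkg[k]
    denom = p_sig + p_bkg
    return p_sig / denom if denom > 0.0 else 0.5

# Toy templates for two hypothetical input variables (illustrative numbers only).
ranges = [(0.0, 400.0), (-3.0, 3.0)]
sig_templates = [normalize([1, 3, 6, 4]), normalize([1, 2, 4, 8])]
bkg_templates = [normalize([6, 5, 2, 1]), normalize([4, 4, 4, 3])]

signal_like = projective_likelihood([250.0, 2.0], sig_templates, bkg_templates, ranges)
background_like = projective_likelihood([50.0, -2.0], sig_templates, bkg_templates, ranges)
```

Events that fall in bins where the signal templates dominate are pushed towards 1, and background-like events towards 0.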



3.2 Neural Networks (NN) 



This approach employs neural networks, which combine many variables into one more powerful
discriminant. They have the general advantage that correlations between the discriminating
input variables are identified and used to optimize the separation between signal and
background. D0 uses Bayesian NNs (BNN), which average over hundreds of networks for each
analysis channel to obtain better separation.

3.3 Matrix Elements (ME) 

The ME method relies on the evaluation of event probability densities for signal and background
processes based on calculations of the standard model differential cross sections. For each
event, with measured quantities $x$, we construct these probability densities for each signal and
background process by integrating the appropriate differential cross section $d\sigma(y)/dy$ over
the underlying partonic quantities $y$, convolved with the parton distribution functions (PDFs)
and detector resolution effects.

The event probability densities are combined into an event probability discriminant:
$\mathrm{EPD} = P_{\rm signal} / (P_{\rm signal} + P_{\rm background})$. To better classify signal
events that contain b jets, the CDF collaboration incorporates the output of a neural network
jet-flavor separator into the final discriminant. D0 applies the NN tagging probability to each
jet and weights all combinations appropriately.
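
The discriminant itself is simple once the probability densities are available. The sketch below assumes that the background density is the sum of the densities of the individual background processes; the function name and the numbers in the example are hypothetical, and the b-tagging information used by both collaborations is left out.

```python
def event_probability_discriminant(p_signal, p_backgrounds):
    """EPD = P_signal / (P_signal + P_background), with P_background summed over processes."""
    p_background = sum(p_backgrounds)
    total = p_signal + p_background
    if total <= 0.0:
        raise ValueError("at least one probability density must be positive")
    return p_signal / total

# Hypothetical densities for one event: one signal and two background hypotheses.
epd = event_probability_discriminant(3.0e-3, [0.6e-3, 0.4e-3])
```

By construction the EPD lies between 0 (background-like) and 1 (signal-like).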

3.4 Boosted Decision Trees (BDT)

The BDT discriminant uses a decision tree method that applies binary cuts iteratively to classify
events. The discrimination is further improved using a boosting algorithm. The BDT
discriminant uses over 20 input variables in the case of CDF and 64 in D0. Some of the most
sensitive are the neural-network jet-flavor separator (only in CDF), the invariant mass of the
$\ell\nu b$ system ($M_{\ell\nu b}$), the total scalar sum of transverse energy in the event $H_T$,
$Q \times \eta$, the dijet mass $M_{jj}$, and the transverse mass of the W boson.
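
To make the idea of boosting concrete, the sketch below trains an ensemble of one-cut trees (decision stumps) with AdaBoost. AdaBoost is used here only as one common choice of boosting algorithm; the text does not specify which algorithm the analyses use, and real trees apply many successive cuts. The toy data set and all names are assumptions made for the illustration.

```python
import math

def best_stump(X, y, w):
    """Find the single binary cut (variable, threshold, sign) with the lowest weighted error."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            for sign in (1, -1):
                pred = [sign if row[j] > thr else -sign for row in X]
                err = sum(wi for wi, p, yi in zip(w, pred, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, j, thr, sign)
    return best

def adaboost(X, y, n_rounds=5):
    """Train boosted one-cut trees; labels are +1 (signal) and -1 (background)."""
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(n_rounds):
        err, j, thr, sign = best_stump(X, y, w)
        err = min(max(err, 1e-10), 1.0 - 1e-10)   # guard against log(0)
        alpha = 0.5 * math.log((1.0 - err) / err)
        ensemble.append((alpha, j, thr, sign))
        # Increase the weight of misclassified events, then renormalize.
        pred = [sign if row[j] > thr else -sign for row in X]
        w = [wi * math.exp(-alpha * p * yi) for wi, p, yi in zip(w, pred, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def bdt_output(ensemble, row):
    """Weighted vote of all stumps, scaled to the range [-1, 1]."""
    norm = sum(alpha for alpha, _, _, _ in ensemble)
    score = sum(alpha * (sign if row[j] > thr else -sign)
                for alpha, j, thr, sign in ensemble)
    return score / norm

# Toy sample with one input variable; no single cut separates the two classes.
X = [[1.0], [2.0], [3.0], [4.0], [5.0], [6.0], [7.0], [8.0]]
y = [-1, -1, -1, 1, 1, -1, 1, 1]
ensemble = adaboost(X, y)
outputs = [bdt_output(ensemble, row) for row in X]
```

Each boosting round concentrates on the events that the previous cuts misclassified, which is what lets a set of weak cuts act as a stronger discriminant.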

3.5 $\not\!\!E_T$+jets (MJ)

The MJ analysis is a new analysis in CDF designed to select events with $\not\!\!E_T$ and jets,
while vetoing events selected by the $\ell + \not\!\!E_T$+jets analyses. It accepts events in which
the W boson decays into $\tau$ leptons and those in which the electron or muon fails the lepton
identification criteria.

The advantage of this analysis is that it is orthogonal to the $\ell + \not\!\!E_T$+jets analyses
described above, increasing the signal acceptance by ~30%. The disadvantage is the large
instrumental background due to QCD events in which mismeasured jet energies produce large
$\not\!\!E_T$ aligned in the same direction as the jets. To reduce this background, a neural
network is used, which removes 77% of the QCD background while keeping 91% of the signal
acceptance.

Finally, the MJ discriminant uses a neural network to combine information from several
input variables. The most important variables are the invariant mass of the $\not\!\!E_T$ and the
second leading jet, the scalar sum of the jet energies, the $\not\!\!E_T$, and the azimuthal angle
between the $\not\!\!E_T$ and the jets.

3.6 Combination 

D0 combines the ME, BNN and BDT analyses using a Bayesian neural network: the three
discriminant outputs in each analysis channel are used as inputs to the network, yielding a
single discriminant output for each channel. As a cross-check, the Best Linear
Unbiased Estimator (BLUE) is used.
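
For reference, the BLUE combination of correlated measurements $x_i$ with covariance matrix $C$ uses the weights $w = C^{-1}\mathbf{1} / (\mathbf{1}^T C^{-1} \mathbf{1})$, which minimize the variance of the unbiased linear combination. The sketch below implements this with a small linear solver; the three input values, their uncertainties and the correlation are made-up numbers for illustration, not results of either experiment.

```python
def solve(matrix, rhs):
    """Solve matrix * u = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    a = [list(row) + [r] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n + 1):
                a[r][c] -= factor * a[col][c]
    u = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(a[r][c] * u[c] for c in range(r + 1, n))
        u[r] = (a[r][n] - tail) / a[r][r]
    return u

def blue(values, covariance):
    """Return (combined value, combined uncertainty, weights) for correlated measurements."""
    u = solve(covariance, [1.0] * len(values))   # u = C^-1 * 1
    norm = sum(u)                                # 1^T C^-1 1
    weights = [ui / norm for ui in u]
    combined = sum(w * x for w, x in zip(weights, values))
    return combined, (1.0 / norm) ** 0.5, weights

# Made-up example: three measurements with unit uncertainty and 60% correlation.
values = [3.8, 4.2, 3.9]
covariance = [[1.0, 0.6, 0.6],
              [0.6, 1.0, 0.6],
              [0.6, 0.6, 1.0]]
combined, uncertainty, weights = blue(values, covariance)
```

With equal uncertainties and equal correlations the weights are all 1/3; strongly correlated inputs reduce the gain in precision from combining them.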

CDF combines the LF, ME, NN, BDT, and LFS channels using a super-discriminant (SD)
technique. The SD method uses a neural network trained with neuro-evolution to separate
the signal from the background, taking as inputs the discriminant outputs of the five analyses
for each event. A simultaneous fit over the two exclusive channels, MJ and SD, is performed to
obtain the final combined results (see next Section).

For illustrative purposes, Fig. 2 shows the distributions of the discriminants resulting from
the combination for CDF (left, $\ell + \not\!\!E_T$+jets analyses only) and D0 (right).



Figure 2: Left (right): CDF (D0) discriminant output distribution of the combined analysis ($\ell + \not\!\!E_T$+jets analyses
only in the CDF case).



4 Statistical Methods 

The cross section is measured using a Bayesian binned likelihood technique, assuming a flat
non-negative prior in the cross section and integrating over the systematic uncertainties. The
measured cross section is quoted as the position of the peak of the posterior density distribution,
and the shortest interval containing 68% of the integral of the posterior is used to set the ±1
sigma uncertainties.
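
A minimal sketch of this procedure is given below, assuming independent Poisson bins, a flat prior evaluated on a finite grid, and no systematic uncertainties (the integration over nuisance parameters is omitted). The bin contents, the signal yields per pb and all function names are illustrative assumptions.

```python
import math

def log_likelihood(xsec, data, sig_per_pb, bkg):
    """Poisson log-likelihood summed over bins for a given cross section (pb)."""
    total = 0.0
    for n, s, b in zip(data, sig_per_pb, bkg):
        mu = xsec * s + b                       # expected events in this bin
        total += n * math.log(mu) - mu - math.lgamma(n + 1)
    return total

def posterior_summary(data, sig_per_pb, bkg, xsec_max=10.0, steps=2000, cl=0.68):
    """Return (peak, low, high): posterior mode and shortest interval containing `cl`."""
    grid = [xsec_max * i / steps for i in range(steps + 1)]   # flat prior on [0, xsec_max]
    logs = [log_likelihood(x, data, sig_per_pb, bkg) for x in grid]
    top = max(logs)
    post = [math.exp(value - top) for value in logs]
    norm = sum(post)
    post = [p / norm for p in post]
    # Highest-density region: add grid points from the most probable downwards.
    order = sorted(range(len(grid)), key=lambda i: post[i], reverse=True)
    covered, chosen = 0.0, []
    for i in order:
        chosen.append(i)
        covered += post[i]
        if covered >= cl:
            break
    return grid[order[0]], grid[min(chosen)], grid[max(chosen)]

# Toy discriminant histogram with four bins (illustrative numbers only).
data = [105, 62, 28, 14]
sig_per_pb = [2.0, 3.0, 3.0, 2.0]     # expected signal events per pb of cross section
bkg = [100.0, 55.0, 20.0, 8.0]        # expected background events
peak, low, high = posterior_summary(data, sig_per_pb, bkg)
```

Because the posterior is single-peaked here, collecting grid points in order of decreasing density yields the shortest interval, which is in general asymmetric around the peak.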

The significance, in CDF, is calculated as a p-value, which is the probability, assuming
single top quark production is absent, that $-2\ln Q = -2\ln\left(p(\mathrm{data}|s+b)/p(\mathrm{data}|b)\right)$
is less than that observed in the data.
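
The sketch below illustrates such a p-value with background-only pseudo-datasets. It assumes independent Poisson bins without systematic uncertainties, for which $-2\ln Q = 2\sum_i \left[s_i - n_i \ln(1 + s_i/b_i)\right]$; the templates, the observed counts and the function names are assumptions for the illustration.

```python
import math
import random

def minus_two_ln_q(data, sig, bkg):
    """-2 ln Q for independent Poisson bins, with Q = p(data|s+b) / p(data|b)."""
    return 2.0 * sum(s - n * math.log(1.0 + s / b) for n, s, b in zip(data, sig, bkg))

def poisson(rng, mean):
    """Draw a Poisson-distributed count (Knuth's method; fine for modest means)."""
    limit, k, p = math.exp(-mean), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

def p_value(data, sig, bkg, n_pseudo=20000, seed=1):
    """Fraction of background-only pseudo-datasets at least as signal-like as the data."""
    rng = random.Random(seed)
    observed = minus_two_ln_q(data, sig, bkg)
    count = 0
    for _ in range(n_pseudo):
        pseudo = [poisson(rng, b) for b in bkg]
        # Smaller -2 ln Q means more signal-like; ties count towards the p-value.
        if minus_two_ln_q(pseudo, sig, bkg) <= observed:
            count += 1
    return count / n_pseudo

# Toy three-bin example (illustrative numbers only).
sig = [4.0, 6.0, 5.0]
bkg = [40.0, 20.0, 8.0]
data = [46, 29, 15]
p = p_value(data, sig, bkg)
```

Resolving a p-value near $10^{-7}$ by direct counting requires far more pseudo-datasets than this toy uses.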

D0 also measures the significance as a p-value but in a different way. An ensemble of
pseudo-datasets without signal contribution is generated, and the significance is defined from
the fraction of these background-only pseudo-datasets with a measured cross section equal to or
higher than the one measured in data.

In both cases, the p-value is then converted into a number of standard deviations using the
integral of one side of a Gaussian function.
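
This conversion is a one-line calculation with the inverse cumulative distribution of a standard normal. The sketch below uses Python's `statistics.NormalDist`; the function name is an assumption.

```python
from statistics import NormalDist

def p_value_to_sigma(p):
    """One-sided Gaussian tail: find z such that P(Z > z) = p for a standard normal Z."""
    return -NormalDist().inv_cdf(p)

# A one-sided p-value of about 2.87e-7 corresponds to 5.0 standard deviations.
z = p_value_to_sigma(2.87e-7)
```

Using `-inv_cdf(p)` rather than `inv_cdf(1 - p)` avoids the loss of precision that comes from subtracting a very small p-value from one.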



5 Cross-checks 



Before investigating the sample of selected events, both collaborations check the modeling of
the distributions of each input variable and the discriminant outputs in data control samples
depleted of signal. These are the $\ell$ + b-tagged four-jet sample, which is enriched in
$t\bar{t}$ events, and the two- and three-jet samples in which there is no b-tagged jet. The
latter have high statistics and are enriched in W+jets and QCD events with kinematics similar to
the b-tagged signal samples. The data distributions in the control samples are described well by
the models.

6 Systematics 

All sources of systematic uncertainty are included and correlations between normalization and
discriminant shape changes are considered. Uncertainties in the jet energy scale, b-tagging
efficiencies, background modeling, lepton identification and trigger efficiencies, the amount of
initial and final state radiation, PDFs, and factorization and renormalization scale have been
explored and incorporated in all individual analyses and the combination.

7 Results 

Table 1 lists the cross sections and significances for each of the component analyses and the
combination for each collaboration. The excess of signal-like events over the expected background
is interpreted as observation of single top production, with p-values of about
$3.10 \times 10^{-7}$ and $2.5 \times 10^{-7}$ for CDF and D0, respectively, corresponding in both
cases to a signal significance of 5.0 standard deviations. The sensitivity in CDF is defined as
the median expected significance and is found to be in excess of 5.9 standard deviations. D0
defines it from the fraction of background-only pseudo-datasets that have a measured cross
section equal to or larger than the SM predicted cross section, and obtains 4.5 standard
deviations.

CDF finds a value of the combined s-channel and t-channel cross sections of
$2.3^{+0.6}_{-0.5}$ pb assuming a top quark mass of 175 GeV/$c^2$. D0 finds a value of
$3.94 \pm 0.88$ pb assuming a top quark mass of 170 GeV/$c^2$.

Since the cross section is proportional to $|V_{tb}|^2$, the value of the CKM matrix element can
be extracted directly. From the cross section measurement at $m_t = 175$ GeV/$c^2$, CDF obtains
$|V_{tb}| = 0.91 \pm 0.11$ (stat + syst) $\pm 0.07$ (theory) and a limit $|V_{tb}| > 0.71$ at the
95% C.L. D0, from the cross section measurement at $m_t = 170$ GeV/$c^2$, obtains
$|V_{tb} f_1^L| = 1.07 \pm 0.12$ (stat + syst + theory) and a limit of $|V_{tb}| > 0.78$ at the
95% C.L. A flat prior in $|V_{tb}|^2$ from 0 to 1 is assumed for the 95% C.L. limit results.
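
As a rough cross-check of the scaling only, a point estimate follows from $|V_{tb}| = \sqrt{\sigma_{\rm measured}/\sigma_{\rm SM}}$, where $\sigma_{\rm SM}$ is the prediction for $|V_{tb}| = 1$. The sketch below uses the rounded values quoted in the text (2.3 pb measured by CDF, about 2.9 pb predicted). It is not the procedure used by the collaborations, who derive $|V_{tb}|$ from the full posterior with the theory prediction at the appropriate top quark mass, so small differences from the quoted result are expected.

```python
import math

def vtb_from_cross_section(sigma_measured, sigma_sm):
    """Point estimate of |Vtb|, assuming sigma scales as |Vtb|^2 and sigma_sm has |Vtb| = 1."""
    return math.sqrt(sigma_measured / sigma_sm)

# Rounded inputs in pb, taken from the text; for illustration only.
vtb_estimate = vtb_from_cross_section(2.3, 2.9)
```

Because of the square root, a given relative uncertainty on the cross section translates into half that relative uncertainty on $|V_{tb}|$.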



8 Conclusions 

In summary, both the CDF and D0 collaborations have developed several multivariate analysis
techniques to distinguish single top signal from background events and have combined them
to measure the electroweak single top production cross section and the CKM matrix
element $|V_{tb}|$. Single top production has been observed for the first time by both
collaborations, CDF and D0, with a significance of 5.0 standard deviations. More details can be found